Abstract: Fundamental limits of the cognitive interference
channel (CIC) with two transmitter-receiver pairs have been
under exploration for several years. In this paper, we study
the discrete memoryless cognitive interference channel (DM- 
CIC) in which the cognitive transmitter non-causally knows 
the full message of the primary transmitter. The capacity of 
this channel is not known in general; it is only known in some 
special cases. Inspired by the concept of less noisy broadcast 
channel (BC), in this work we introduce the notion of less 
noisy cognitive interference channel. Unlike BC, due to the 
inherent asymmetry of the cognitive channel, two different less 
noisy channels are distinguishable; these are named the primary-less-noisy
and cognitive-less-noisy channels. We derive the capacity
region for the latter case by introducing inner and outer bounds
on the capacity of the DM-CIC and showing that these bounds 
coincide for the cognitive-less-noisy channel. Having established 
the capacity region, we prove that superposition coding is the 
optimal encoding technique. 

I. Introduction 

A two-user interference channel (IC) is a network consist- 
ing of two transmitter-receiver pairs, communicating over 
the same channel, and thus interfering with each other. In certain
communication scenarios, e.g., cognitive radio, one transmit- 
ter (the cognitive transmitter) is able to sense the environment 
and obtain side information about the incumbent transmitter 
(the primary transmitter). Such a communication channel 
is called interference channel with cognition or simply the 
cognitive channel. Motivated by cognitive radio's promise 
for increasing the spectral efficiency in wireless systems, the 
study of interference channel with cognitive users has been 
receiving increasing attention during the past years. 

Fundamental limits of the cognitive interference channel,
in which the cognitive transmitter non-causally knows the
full message of the primary user, have been studied in [1]-
[12]. This channel was first introduced in [1] where the au- 
thors obtained achievable rates by applying Gel'fand-Pinsker 
coding [15] to the celebrated Han-Kobayashi encoding [14] 
for the IC. The capacity of this channel remains unknown in 
general; however, it is known in several special cases, both 
in the discrete memoryless and Gaussian channels. 

The capacity of the Gaussian cognitive interference channel
(GCIC) is known at low interference [2], [3], as well
as at strong interference [4]. In addition, the capacity of the Gaussian
cognitive Z-interference channel (GCZIC), in which the
primary receiver is interfered by the cognitive transmitter,
is known for several ranges of interference gain [8]-[11].
While at low interference dirty paper coding [13] is the capacity-achieving
scheme, at high interference superposition coding
is the optimal technique. For the discrete memoryless
channel, capacity is known for the "strong interference" [4],
"weak interference" [3], and "better cognitive decoding" [7]
regimes. Effectively, superposition coding is the capacity-achieving
technique in all the above cases, although several other
techniques, including rate-splitting, simultaneous coding, and
Gel'fand-Pinsker coding (binning), are used to find achievable
rate regions.

In this paper, we consider the discrete memoryless cog- 
nitive interference channel (DM-CIC). We first introduce 
the notion of less noisy DM-CIC and show that there are 
two different less noisy cognitive channels: the primary-less- 
noisy and cognitive-less-noisy DM-CIC. In the former, the 
primary receiver is less noisy than the secondary receiver, 
whereas it is the opposite in the latter. 

Afterward, we propose two inner bounds for the DM-CIC; 
one based on superposition coding, and another one using 
independent coding. Obviously, these inner bounds are also 
valid for less noisy DM-CIC; in fact, one of these inner 
bounds is more suitable for the primary-less-noisy DM-CIC 
whereas the other one is better for the cognitive-less-noisy 
DM-CIC. We also prove an outer bound on the capacity of 
this channel. 

Finally, we show that for the cognitive-less-noisy DM- 
CIC the inner and outer bounds coincide, and therefore we 
establish the capacity region for this class of DM-CIC. This 
proves that superposition coding is the capacity-achieving 
scheme in the less noisy DM-CIC, as it is in the less noisy 
BC. Although the capacity of the primary-less-noisy DM-CIC
remains unknown, the corresponding inner bound simplifies to
an achievable region that has already been proved to be 
capacity-achieving in the special case of GCZIC [8], [10]. 

This paper is organized as follows. In Section II we
introduce the system model and define the less noisy DM-CIC.
In Section III we propose an outer bound and two inner
bounds for the DM-CIC. Then, in Section IV we show that
one of the inner bounds is tight for the cognitive-less-noisy
channel, and thus provides the capacity for this class of the DM-CIC.
The new capacity result is compared with the existing ones
in Section V.

Fig. 1. The discrete memoryless cognitive interference channel (DM-CIC)
with two transmitters and two receivers. $M_1, M_2$ are two messages,
$X_1, X_2$ are inputs, $Y_1, Y_2$ are outputs, and $p(y_1, y_2 \mid x_1, x_2)$ is the transition
probability of the channel.



II. Problem Setup and Definitions 

The two-user discrete memoryless cognitive interference 
channel (DM-CIC) is an interference channel [16] that 
consists of two transmitter-receiver pairs, in which one 
transmitter (the cognitive user) knows the message of the 
other transmitter (the primary one), in addition to its own 
message. In what follows, we formally define this channel
and a special class of it.

A. Discrete Memoryless Cognitive Interference Channel 

The discrete memoryless cognitive interference channel
(DM-CIC) is depicted in Figure 1. Let $M_1$ and $M_2$ be
two independent messages which are uniformly distributed
on the set of all messages for the first and second users,
respectively. Transmitter $i$, $i \in \{1,2\}$, wishes to transmit
message $M_i$ to receiver $i$ in $n$ channel uses at rate $R_i$.
Message $M_2$ is available only at transmitter 2, while both
transmitters know $M_1$. This channel is defined by a tuple
$(\mathcal{X}_1, \mathcal{X}_2; p(y_1, y_2 \mid x_1, x_2); \mathcal{Y}_1, \mathcal{Y}_2)$, where $\mathcal{X}_1, \mathcal{X}_2$ and $\mathcal{Y}_1, \mathcal{Y}_2$
are the input and output alphabets, and $p(y_1, y_2 \mid x_1, x_2)$ is the channel
transition probability distribution.

The capacity of the DM-CIC is known in the "strong interference"
[4], "weak interference" [3], and "better cognitive
decoding" [7] regimes. These capacity results are listed in
Table I and labeled $C_1$, $C_2$, and $C_3$, respectively. In the first
case, both receivers can decode both messages. In all the above
cases, the cognitive receiver has a better condition (more
information) than the primary one, in some sense, as is
evident from the corresponding conditions in Table I.

B. Less Noisy DM-CIC 

Since the second transmitter has complete and non-causal 
knowledge of both messages, it can act like a BC transmitter. 
Particularly, in the absence of the first transmitter this channel 
becomes the well-known DM-BC [20]. When the first transmitter
is present, this channel is no longer a BC; however, one can
define conditions, similar to those in the BC, showing that one
receiver is in a "better" condition than the other to decode 
the messages, e.g., one receiver is less noisy or more capable 
than the other [18], [17]. 



In [8], [10], the authors extended this notion to the DM- 
CIC, and studied the case where the primary receiver is more 
capable than the secondary receiver. This led to the capacity 
of the GCZIC at very strong interference. In what follows, 
we introduce the notion of less noisy cognitive interference 
channel, and show that two different less noisy DM-CICs
arise, depending on which receiver is in a better condition.
These are formally defined in the following. 

Definition 1. The DM-CIC is said to be primary-less-noisy
if

$$I(U;Y_1) \ge I(U;Y_2) \tag{1}$$

for all $p(u, x_1, x_2)$.

Definition 2. The DM-CIC is said to be cognitive-less-noisy
if

$$I(U;Y_2) \ge I(U;Y_1) \tag{2}$$

for all $p(u, x_1, x_2)$.



It is clear that in the first case the primary receiver is less 
noisy than the cognitive receiver whereas in the second case 
the cognitive receiver is less noisy than the primary receiver. 
Therefore, given the channel condition, a DM-CIC can be 
primary-less-noisy, cognitive-less-noisy, neither of them or 
both. 
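
As an illustration that is not taken from the original text, one sufficient condition for Definition 2 is physical degradedness of the primary output, assumed here to mean that $(X_1, X_2) \to Y_2 \to Y_1$ forms a Markov chain. The following sketch shows why this hypothetical assumption implies condition (2):

```latex
% Assumption (illustrative, not from the paper):
%   p(y_1, y_2 | x_1, x_2) = p(y_2 | x_1, x_2) p(y_1 | y_2),
% i.e. (X_1, X_2) -> Y_2 -> Y_1 is a Markov chain.
% The auxiliary U always satisfies U -> (X_1, X_2) -> (Y_1, Y_2),
% so the joint distribution factors as
%   p(u, x_1, x_2) p(y_2 | x_1, x_2) p(y_1 | y_2),
% which gives the longer chain U -> (X_1, X_2) -> Y_2 -> Y_1.
\begin{align*}
I(U;Y_1) &\le I(U;Y_2)
  && \text{(data processing inequality on } U \to Y_2 \to Y_1\text{)}
\end{align*}
% The bound holds for every p(u, x_1, x_2), which is condition (2).
```

The reverse implication is not claimed here; the sketch only exhibits one family of channels that satisfies the definition.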

III. Inner and Outer Bounds for the DM-CIC 

In this section, we first introduce an outer bound on the 
capacity of the DM-CIC; we then derive two achievable 
rate regions for this channel. The first achievable region is
based on the superposition coding technique; it is inspired by the
capacity-achieving superposition coding in the less noisy and
more capable DM-BC, or the inner bound introduced for the
more capable DM-CIC in [10]. The idea of the outer bound also
comes from the capacity of the less noisy DM-BC. However,
we combine two different bounds to find a unified one. 

A. A Unified Outer Bound 

Inspired by the capacity of the less noisy BC [18], and definitions
(1) and (2) for less noisy cognitive interference channels, we
present a simple outer bound on the capacity of the DM-CIC.
This outer bound is in fact a combination of two simpler 
outer bounds as we describe later in this section. Each outer 
bound can be tight in specific cases of less noisy DM-CIC, 
as it will be shown later. 

The following provides an outer bound on the capacity of 
the DM-CIC. 

Theorem 1. The union of rate pairs $(R_1, R_2)$ such that

$$R_1 \le I(U;Y_1), \tag{3}$$
$$R_2 \le I(V;Y_2), \tag{4}$$
$$R_1 + R_2 \le I(X_2;Y_2 \mid U) + I(U;Y_1), \tag{5}$$
$$R_1 + R_2 \le I(X_1;Y_1 \mid V) + I(V;Y_2), \tag{6}$$

for some joint distribution $p(u, v, x_1, x_2)$ gives an outer
bound on the capacity region of the DM-CIC.



Proof. The proof of the second and last inequalities follows
the same line of argument as in the outer bound of the more
capable DM-CIC [10, Theorem 2], or similarly the converse
of the more capable BC [17]. The other two inequalities, by
symmetry, follow the same line of proof. The essence of the
proof is to use the Csiszár sum identity and
the auxiliary random variables $U_i = (M_1, Y_1^{i-1}, Y_{2,i+1}^n)$ and
$V_i = (M_2, Y_1^{i-1}, Y_{2,i+1}^n)$. The choice of $U_i$, $V_i$ indicates that
they are correlated; hence, the outer bound is over the joint
distribution $p(u, v)\,p(x_1, x_2 \mid u, v)\,p(y_1, y_2 \mid x_1, x_2)$. □

The symmetry of the outer bound indicates how it consists
of two simpler outer bounds: one including (3) and (5),
and the other including (4) and (6). Each outer bound
resembles the capacity of the less noisy DM-BC [18].

B. New Achievable Rate Regions 

We next provide two achievable rate regions for the DM- 
CIC. The first achievable region uses superposition encoding 
at the cognitive transmitter whereas the second one encodes 
independently. The decoding is based on joint typicality
in both cases.

Theorem 2. The union of rate regions given by

$$R_1 \le I(W, X_1; Y_1),$$
$$R_2 \le I(X_2; Y_2 \mid W, X_1), \tag{7}$$
$$R_1 + R_2 \le I(X_1, X_2; Y_2),$$

is achievable for the DM-CIC, where the union is over all
probability distributions $p(w, x_1, x_2)$.

Proof. The proof of Theorem 2 uses the superposition coding
idea in which $Y_1$ can only decode $M_1$ while $Y_2$ is intended
to decode both $M_1$ and $M_2$. Considering the space of all
codewords, one can view the $(W, X_1)$ as cloud centers, and
the $X_2$ as satellites [19]. For completeness, the details of the
proof are provided in Section VI-A. □

In light of the above discussion, we expect the encoding 
scheme in Theorem 2 to be more favorable when the second
receiver is in a better situation than the first one, because it 
can decode both cloud centers and satellites. If the channel 
condition is the reverse, i.e., the first receiver has a better 
situation than the second receiver, it makes sense to reverse 
the order of encoding. However, at the first transmitter, we 
cannot do superposition encoding against the codeword of 
the secondary transmitter because the first transmitter does 
not know the message of the cognitive user. As a result, the
input distribution needs to be independent as proposed in the 
following theorem. 

Theorem 3. The union of rate regions given by

$$R_1 \le I(X_1; Y_1 \mid W, X_2),$$
$$R_2 \le I(W, X_2; Y_2), \tag{8}$$
$$R_1 + R_2 \le I(X_1, X_2; Y_1),$$

is achievable for the DM-CIC, where the union is over
all probability distributions $p(w, x_1, x_2)$ that factor as
$p(w, x_2)\,p(x_1)$.



Proof. The proof of Theorem 3 uses independent encoding
of $X_1$ and $(W, X_2)$; however, $Y_1$ is intended to decode both
messages whereas $Y_2$ can only decode $M_2$. The proof of
Theorem 3 follows steps similar to those of Theorem 2, but the
input distributions are different. The details of the proof can
be found in Section VI-B. □

IV. The Capacity of Less Noisy DM-CIC 

In this section, we simplify the inner bounds in Theorem 2
and Theorem 3, respectively, for the cognitive-less-noisy and
primary-less-noisy DM-CIC defined in (2) and (1). Then,
by comparing the first inner bound with the outer bound in
Theorem 1, we establish the capacity region for the
cognitive-less-noisy DM-CIC.

A. The Cognitive-less-noisy DM-CIC 

Theorem 4. For the cognitive-less-noisy DM-CIC, the capacity
region is given by the set of all rate pairs $(R_1, R_2)$
such that

$$R_1 \le I(U; Y_1), \tag{9}$$
$$R_2 \le I(X_2; Y_2 \mid U), \tag{10}$$

for some $p(u, x_2)$.

Proof. Consider the achievable region in Theorem 2 and
define $U = (W, X_1)$. From (2) we know that, for the
cognitive-less-noisy DM-CIC, $I(U; Y_1) \le I(U; Y_2)$. Then, it
can be simply verified that the third inequality in Theorem 2
becomes redundant for this channel. Thus, the achievability
of the rate region in Theorem 4 immediately follows. To
prove the converse, we consider inequalities (3) and (5) from
the outer bound in Theorem 1, which are

$$R_1 \le I(U; Y_1), \tag{11}$$
$$R_1 + R_2 \le I(X_2; Y_2 \mid U) + I(U; Y_1). \tag{12}$$

Clearly, these two inequalities make an outer bound on
the capacity of any DM-CIC for some joint distributions
$p(u, x_1, x_2)\,p(y_1, y_2 \mid x_1, x_2)$. An alternative representation of
this outer bound is given by [18], [17],

$$R_1 \le I(U; Y_1),$$
$$R_2 \le I(X_2; Y_2 \mid U),$$

which is equal to the achievable region given in Theorem 4.
Hence, the rate region in Theorem 4 is the capacity of the
cognitive-less-noisy DM-CIC. Note that the region characterized
by (11) and (12) and the region in the alternative representation
are not necessarily equal for fixed $(U, X_1)$; however, their
convex hulls over all $p(u, x_1)$ become the same.

□
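
The redundancy of the sum-rate constraint claimed in the proof can be checked in a few lines. The sketch below uses only the definition $U = (W, X_1)$, condition (2), and the fact that $W \to (X_1, X_2) \to Y_2$ forms a Markov chain (the same fact used in the proof of Theorem 2 in the appendix):

```latex
% Add the first two constraints of Theorem 2, written with U = (W, X_1):
\begin{align*}
R_1 + R_2 &\le I(U;Y_1) + I(X_2;Y_2 \mid U) \\
          &\le I(U;Y_2) + I(X_2;Y_2 \mid U)
             && \text{by (2)} \\
          &= I(U, X_2; Y_2)
             && \text{chain rule} \\
          &= I(W, X_1, X_2; Y_2)
             && U = (W, X_1) \\
          &= I(X_1, X_2; Y_2)
             && W \to (X_1, X_2) \to Y_2 .
\end{align*}
% Hence the third inequality of Theorem 2 is implied by the first two
% whenever the channel is cognitive-less-noisy.
```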

We further observe that the auxiliary random variable $U$
in the capacity region can be replaced by $(W, X_1)$, which
results in the following corollary.

Corollary 1. The capacity region of the cognitive-less-noisy
DM-CIC can be expressed as

$$R_1 \le I(W, X_1; Y_1),$$
$$R_2 \le I(X_2; Y_2 \mid W, X_1), \tag{13}$$

for some $p(w, x_1, x_2)$.

Proof. The achievability of this region is obvious from
Theorem 2 and the condition in (2). To prove the converse,
we use the last two constraints of the outer bound in [3,
Theorem 3.2], which are (note the reversal of indices)

$$R_1 \le I(W, X_1; Y_1),$$
$$R_1 + R_2 \le I(X_2; Y_2 \mid W, X_1) + I(W, X_1; Y_1), \tag{14}$$

for some $p(w, x_1, x_2)$. However, with an argument similar to that
used in the proof of Theorem 4, the outer bound in (14) can
be alternatively represented as the constraints in (13). □

The capacity-achieving technique in Theorem 4 is the
well-known superposition coding, similar to that in the less
noisy BC [18]. Superposition coding has been proved to be
the optimal encoding in several other cases, both for the DM-CIC
(see Table I) and the GCZIC [10].

B. The Primary-less-noisy DM-CIC 

One may expect a similar result for the primary-less-noisy
DM-CIC, by applying the corresponding condition in (1) to
the rate region in Theorem 3. However, since Theorem 3
holds only for independent $x_1$ and $x_2$, the capacity region cannot
be established in general. Instead, we can state the following.

Corollary 2. The union of all rate pairs $(R_1, R_2)$ satisfying

$$R_1 \le I(X_1; Y_1 \mid V), \tag{15}$$
$$R_2 \le I(V; Y_2), \tag{16}$$

over all probability distributions $p(v, x_1, x_2, y_1, y_2)$ that
factor as $p(v)\,p(x_2)\,p(y_1, y_2 \mid x_1, x_2)$ is achievable for the
primary-less-noisy DM-CIC.

Proof. By symmetry, the proof of this corollary follows the
same line of argument as the proof of Theorem 4. To
prove the achievability, define $V = (W, X_2)$ and apply
the condition of the primary-less-noisy DM-CIC in (1) to
Theorem 3; this makes the third inequality of Theorem 3
redundant and completes the proof of the achievability. □

Note that, from (4) and (6), an outer bound that resembles
the rate region in Corollary 2 can be built, but this outer
bound is over $p(v, x_2)$, which is, in general, larger than the
inner bound in Corollary 2. Nevertheless, in the following
section we discuss how this region can yield the capacity
region for a particular channel.

V. Comparison and Discussion 

In this section we compare the capacity region obtained
in Theorem 4 with the existing capacity results for the DM-CIC.
Table I summarizes the capacity results for the DM-CIC
in chronological order.

We show that the capacity result for the cognitive-less-noisy DM-CIC
is a special case of the capacity result derived in [3], which
is labeled as $C_2$ in Table I. To this end, we first show that
the condition (2) of the cognitive-less-noisy DM-CIC implies both
conditions required for $C_2$. First, since $I(U; Y_1) \le I(U; Y_2)$
holds for any $p(u, x_1, x_2)$, setting $U = X_1$ gives
$I(X_1; Y_1) \le I(X_1; Y_2)$. The other condition is obtained
from the following lemma.

Lemma 1. If $I(U; Y_1) \le I(U; Y_2)$ holds for all joint
distributions $p(u, x_1, x_2)$, then $I(U; Y_1 \mid X_1) \le I(U; Y_2 \mid X_1)$
for all $p(u, x_1, x_2)$.

Proof. See Appendix VI-C. □

Thus, the condition required for $C_4$ is more demanding
than that of $C_2$. In other words, if the cognitive receiver
in a DM-CIC is less noisy than the primary one, the DM-CIC
will satisfy the "weak interference" conditions. Further,
we observe that, when $U$ is replaced by $(U, X_1)$, the capacity region
$C_4$ becomes the same as $C_2$. This is also evident from
Corollary 1.

It is also worth mentioning that, for $U = X_1$, with the further
assumption that $I(X_2; Y_2 \mid X_1) \le I(X_2; Y_1 \mid X_1)$, $C_4$ becomes
equivalent to $C_1$. This indicates that we can use superposition
coding to achieve the capacity of the DM-CIC in the "strong
interference" regime. Note that the capacity region in the
"strong interference" regime ($C_1$ in Table I) can be re-expressed as

$$R_1 \le I(X_1; Y_1), \tag{17}$$
$$R_2 \le I(X_2; Y_2 \mid X_1). \tag{18}$$

In this setting, $X_1$ and $X_2$, respectively, can be viewed as
cloud centers and satellites of superposition coding. Originally,
the achievability of $C_1$ was proved by using the capacity
of compound multiple access channels [5], which is based
on transmitting private and common messages.

It should be highlighted that the technique used to achieve
$C_3$ is also effectively superposition coding, although it is
derived (simplified) from a scheme that uses rate-splitting,
binning, and superposition coding collectively. This can be
verified by looking at the simplified encoding in the proof
of the achievability in [7]. Therefore, we can see that all
capacity results in Table I ($C_1$-$C_4$) can be achieved using
superposition coding.¹

Finally, consider the primary-less-noisy DM-CIC. The
condition required for this channel is rather different from
that in all other cases for which the capacity region is known, which are
listed in Table I. To appreciate this, from Table I one can
see that in all those cases ($C_1$-$C_4$) the cognitive receiver
has, in some sense, more information than the primary
one. In contrast, in a primary-less-noisy DM-CIC, the
primary receiver is assumed to have more information than
the cognitive receiver, as (1) implies. This condition could
particularly arise in the cognitive Z-interference channel in
which the link from the primary user to the cognitive receiver
is absent. For example, one can verify that the capacity result
for the GCZIC at very strong interference [10, Corollary 4]
is the counterpart of Corollary 2 for Gaussian inputs. This
is also shown independently in [11, Theorem V.2].

¹ We should emphasize that $C_3$ is just a different representation of $C_2$; this
is because the conditions required for these two capacity regions are equivalent.
This is proved in [21].



TABLE I

Summary of the capacity results for the discrete memoryless cognitive interference channel

Label | Condition | Capacity region | Encoding | Reference
$C_1$ | $I(X_1, X_2; Y_1) \le I(X_1, X_2; Y_2)$; $I(X_2; Y_2 \mid X_1) \le I(X_2; Y_1 \mid X_1)$ | $R_1 + R_2 \le I(X_1, X_2; Y_1)$; $R_2 \le I(X_2; Y_2 \mid X_1)$ | superposition coding | [4]
$C_2$ | $I(X_1; Y_1) \le I(X_1; Y_2)$; $I(U; Y_1 \mid X_1) \le I(U; Y_2 \mid X_1)$ | $R_1 \le I(U, X_1; Y_1)$; $R_2 \le I(X_2; Y_2 \mid U, X_1)$ | superposition coding | [3]
$C_3$ | $I(U, X_1; Y_1) \le I(U, X_1; Y_2)$ | $R_1 \le I(U, X_1; Y_1)$; $R_2 \le I(X_2; Y_2 \mid X_1)$; $R_1 + R_2 \le I(U, X_1; Y_1) + I(X_2; Y_2 \mid U, X_1)$ | rate-splitting, binning, and superposition coding | [7]
$C_4$ | $I(U; Y_1) \le I(U; Y_2)$ (cognitive-less-noisy DM-CIC) | $R_1 \le I(U; Y_1)$; $R_2 \le I(X_2; Y_2 \mid U)$ | superposition coding | Theorem 4

It should be emphasized that the technique used to achieve $C_3$ is effectively superposition coding, although it is derived (simplified) from a scheme
that uses rate-splitting, binning, and superposition coding. In fact, $C_3$ is only a different representation of $C_2$, as shown in [21].



VI. Appendix 
A. Proof of Theorem 2

Proof. We prove this theorem by showing the code construc- 
tion, encoding, decoding, and error analysis. 

1) Code construction: Fix $p(w, x_1)$ and $p(x_2 \mid w, x_1)$.
Randomly and independently generate $2^{nR_1}$ sequences
$(w^n(m_1), x_1^n(m_1))$, $m_1 \in [1 : 2^{nR_1}]$, i.i.d. according
to $\prod_{i=1}^{n} p_{WX_1}(w_i, x_{1i})$. Next, for each sequence
$(w^n(m_1), x_1^n(m_1))$, randomly and conditionally
independently generate $2^{nR_2}$ sequences $x_2^n(m_1, m_2)$,
$m_2 \in [1 : 2^{nR_2}]$, with i.i.d. elements according to
$\prod_{i=1}^{n} p_{X_2 \mid WX_1}(x_{2i} \mid w_i(m_1), x_{1i}(m_1))$.

2) Encoding: To send messages $(m_1, m_2)$, the primary
transmitter sends the codeword $x_1^n(m_1)$ whereas the secondary
transmitter sends the codeword $x_2^n(m_1, m_2)$.

3) Decoding: We use joint typicality for decoding. The
cognitive receiver ($Y_2$) can decode both messages whereas
the other receiver can only decode one of them, namely $m_1$.
Decoder 1 declares that message $\hat{m}_1$ is sent if it is the unique
message such that $(w^n(\hat{m}_1), x_1^n(\hat{m}_1), y_1^n) \in \mathcal{T}_\epsilon^{(n)}$. Likewise,
decoder 2 declares that message $\hat{m}_2$ is sent if it is the unique
message such that $(w^n(m_1), x_1^n(m_1), x_2^n(m_1, \hat{m}_2), y_2^n) \in \mathcal{T}_\epsilon^{(n)}$
for some $m_1$. In other cases, as analyzed below, the
decoders declare an error.

4) Error Analysis: Without loss of generality, we assume
that $(M_1, M_2) = (1, 1)$ is sent in order to analyze the
probability of error. To evaluate the average probability of
error for decoder 1, we define the following error events

$$E_{11} = \{(W^n(1), X_1^n(1), Y_1^n) \notin \mathcal{T}_\epsilon^{(n)}\},$$
$$E_{12} = \{(W^n(m_1), X_1^n(m_1), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \ne 1\}.$$

Then, by using the union bound, the probability of error for
decoder 1 is upper bounded by

$$P(E_1) = P(E_{11} \cup E_{12}) \le P(E_{11}) + P(E_{12}). \tag{19}$$

But $P(E_{11}) \to 0$ as $n \to \infty$, by the law of large numbers
(LLN). Moreover, since for $m_1 \ne 1$, $(W^n(m_1), X_1^n(m_1))$ is
independent of $(W^n(1), X_1^n(1), Y_1^n)$, by the packing lemma
[18], $P(E_{12}) \to 0$ as $n \to \infty$ if $R_1 < I(W, X_1; Y_1) - \delta(\epsilon)$.
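
To make the packing-lemma step for $E_{12}$ concrete, the following sketch spells out the union bound behind it. It is the standard argument rather than a step written out in the original proof, and $\delta(\epsilon)$ denotes a quantity that tends to zero as $\epsilon \to 0$:

```latex
% For each m_1 != 1, the pair (W^n(m_1), X_1^n(m_1)) is generated
% independently of Y_1^n, so the joint typicality lemma bounds the
% probability that any single wrong candidate looks jointly typical.
\begin{align*}
P(E_{12})
  &\le \sum_{m_1 \ne 1}
       P\big( (W^n(m_1), X_1^n(m_1), Y_1^n) \in \mathcal{T}_\epsilon^{(n)} \big) \\
  &\le 2^{nR_1} \, 2^{-n \left( I(W, X_1; Y_1) - \delta(\epsilon) \right)} .
\end{align*}
% The right-hand side tends to zero as n grows
% whenever R_1 < I(W, X_1; Y_1) - \delta(\epsilon).
```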



To evaluate the average probability of error for decoder 2,
we define the following error events

$$E_{21} = \{(W^n(1), X_1^n(1), X_2^n(1, 1), Y_2^n) \notin \mathcal{T}_\epsilon^{(n)}\},$$
$$E_{22} = \{(W^n(1), X_1^n(1), X_2^n(1, m_2), Y_2^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_2 \ne 1\},$$
$$E_{23} = \{(W^n(m_1), X_1^n(m_1), X_2^n(m_1, m_2), Y_2^n) \in \mathcal{T}_\epsilon^{(n)} \text{ for some } m_1 \ne 1, m_2 \ne 1\}.$$

Using the union bound, the probability of error for decoder 2 is
bounded as

$$P(E_2) = P(E_{21} \cup E_{22} \cup E_{23}) \le P(E_{21}) + P(E_{22}) + P(E_{23}). \tag{20}$$

Now, we evaluate the terms on the right-hand side (RHS) of
this inequality when $n \to \infty$. First, by the LLN, $P(E_{21}) \to 0$
as $n \to \infty$. Then, for $m_2 \ne 1$, $X_2^n(1, m_2)$ is conditionally
independent of $Y_2^n$ given $(W^n(1), X_1^n(1))$. Thus, by the
packing lemma, $P(E_{22}) \to 0$ as $n \to \infty$ given $R_2 <
I(X_2; Y_2 \mid W, X_1) - \delta(\epsilon)$. Finally, consider $E_{23}$; for $m_1 \ne 1$
and $m_2 \ne 1$, $(W^n(m_1), X_1^n(m_1), X_2^n(m_1, m_2))$ is independent
of $Y_2^n$. Again, by the packing lemma, $P(E_{23}) \to 0$
as $n \to \infty$ if $R_1 + R_2 < I(W, X_1, X_2; Y_2) - \delta(\epsilon) =
I(X_1, X_2; Y_2) - \delta(\epsilon)$; the equality follows since
$W \to (X_1, X_2) \to Y_2$ forms a Markov chain. The proof of achievability
is completed by the above analysis. That is, if (7) is
satisfied, both receivers can decode the corresponding messages
with the total probability of error tending to zero. Therefore,
there exists a sequence of good codes for which the error
probability goes to 0. □

B. Proof of Theorem 3

Proof. We prove this theorem by showing the code construc- 
tion, encoding, decoding, and error analysis. 

1) Code construction: Fix $p(x_1)$ and $p(w, x_2)$. Randomly
and independently generate $2^{nR_1}$ sequences $x_1^n(m_1)$, $m_1 \in
[1 : 2^{nR_1}]$, i.i.d. according to $\prod_{i=1}^{n} p_{X_1}(x_{1i})$. Also, for each
$x_1^n(m_1)$, randomly and independently generate $2^{nR_2}$ sequences
$(w^n(m_1, m_2), x_2^n(m_1, m_2))$, $m_2 \in [1 : 2^{nR_2}]$, with i.i.d.
elements according to $\prod_{i=1}^{n} p_{WX_2}(w_i(m_1, m_2), x_{2i}(m_1, m_2))$.



2) Encoding: To send messages $(m_1, m_2)$, the primary
and cognitive transmitters, respectively, send the codewords
$x_1^n(m_1)$ and $x_2^n(m_1, m_2)$.

3) Decoding: We use joint typicality for decoding, where
the primary receiver can decode both messages whereas
the cognitive receiver can only decode $m_2$. Decoder 2
declares that message $\hat{m}_2$ is sent if it is the unique message
such that $(w^n(m_1, \hat{m}_2), x_2^n(m_1, \hat{m}_2), y_2^n) \in \mathcal{T}_\epsilon^{(n)}$
for some $m_1$. Similarly, decoder 1 declares that message
$\hat{m}_1$ is sent if it is the unique message such that
$(w^n(\hat{m}_1, m_2), x_2^n(\hat{m}_1, m_2), x_1^n(\hat{m}_1), y_1^n) \in \mathcal{T}_\epsilon^{(n)}$. In other
cases, the decoders declare an error.

4) Error Analysis: The error analysis is very similar to that
of Theorem 2 and is omitted here. □

C. Proof of Lemma 1

Proof. The lemma is similar to [5, Lemma 5]. We can write

$$I(U; Y_1 \mid X_1) = \sum_{x_1} p(x_1)\, I(U; Y_1 \mid X_1 = x_1)$$
$$\le \sum_{x_1} p(x_1)\, I(U; Y_2 \mid X_1 = x_1)$$
$$= I(U; Y_2 \mid X_1), \tag{21}$$

where the inequality follows because $I(U; Y_1) \le I(U; Y_2)$ holds
for all joint distributions $p(u, x_1, x_2)$. □
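
The inequality step applies the less noisy condition to a conditional distribution. A brief sketch of that argument follows; it assumes, as the lemma's hypothesis states, that the condition holds for every joint distribution, including those in which $X_1$ is deterministic. The symbols $q$ and $x_1'$ are introduced only for this illustration:

```latex
% Fix a value x_1 with p(x_1) > 0 and define the joint distribution
%   q(u, x_1', x_2) = p(u, x_2 | x_1) 1{x_1' = x_1},
% under which X_1 is deterministic and equal to x_1.
% Under q, (U, X_2, Y_1, Y_2) has the same law as under p given X_1 = x_1,
% so the hypothesis I(U;Y_1) <= I(U;Y_2), evaluated at q, reads
\begin{align*}
I(U; Y_1 \mid X_1 = x_1) \le I(U; Y_2 \mid X_1 = x_1).
\end{align*}
% Multiplying by p(x_1) and summing over x_1 yields
% I(U;Y_1 | X_1) <= I(U;Y_2 | X_1).
```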